aa[s84,jmc]		The limitations of the analysis of algorithms

	In connection with proposals to add another senior faculty member
in this area, I found myself motivated to make my objections precise.

	Stanford is well supplied with such faculty, and certain criteria
of merit would lead to entirely filling our department with them.

	Namely, analysis of algorithms offers the best opportunities
of any field of computer science for doing interesting mathematical work
that is relevant, at least metaphorically, to important computer science.
For this reason it attracts people of strong mathematical interests
and abilities.

	The crack about "metaphorically", however, goes to the root of
the problem.  Often what is done mathematically is relevant to only
a small part of the problem.  With present mathematical techniques,
it is difficult to see how to do much more, and it is hard to see
what advances would make more of the problems mathematically accessible.

	The one clear example that I can cite is alpha-beta pruning.
I invented the general idea and the name "alpha-beta heuristic" in
1956, in reaction to a description by Alex Bernstein of how he
planned to implement his chess program.  He planned to choose 7
plausible moves in each position, 7 replies to each for the opponent,
and so on to 4 plies leading to 7↑4 end positions.  He could not
be convinced at that time that this was suboptimal and that many
of the branches could be cut off.  In fact his running program
did not prune, and this contributed to its weak performance.  From
the AI point of view, this illustrates the fact that Bernstein,
like everyone else, could not observe his own mental processes
very well.  Since he is an excellent chess player, he surely
pruned when he was considering a variation.

	My own formulation of alpha-beta was, from the AA point of
view, contaminated by an additional heuristic that spoiled the
nice mathematical properties.  Namely, I proposed that each non-terminal
position be given both an optimistic and a pessimistic evaluation.
Alpha cutoff would take place if the optimistic evaluation was less
than alpha and beta cutoff would take place if the pessimistic
evaluation was greater than beta.  This reduces to the usual
alpha-beta pruning provided the optimistic evaluation is taken
to be ∞ and the pessimistic evaluation to be -∞ for non-terminal
positions.  Using the optimistic and pessimistic evaluations
obscured the fact that alpha-beta was uniformly better than
minimax in that it got the same answer with less work.
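
	Here is a minimal sketch of that variant in present-day terms.
The tree encoding, the function name, and the use of Python are
illustrative assumptions of the sketch, not anything from the 1956
formulation: a leaf is just its numeric value, and an interior node
carries the two static bounds together with its successors.

import math

def search(node, alpha, beta, maximizing):
    # A leaf is its static evaluation; an interior node is a tuple
    # (optimistic, pessimistic, children).
    if not isinstance(node, tuple):
        return node

    opt, pess, children = node

    # The extra heuristic: cut off on the static bounds alone.
    if opt < alpha:                      # even the optimistic value is too low
        return opt
    if pess > beta:                      # even the pessimistic value is too high
        return pess

    if maximizing:
        best = -math.inf
        for child in children:
            best = max(best, search(child, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:            # ordinary cutoff
                break
    else:
        best = math.inf
        for child in children:
            best = min(best, search(child, alpha, beta, True))
            beta = min(beta, best)
            if alpha >= beta:
                break
    return best

# With optimistic = +inf and pessimistic = -inf at every interior node the
# two extra tests never fire, and the routine is standard alpha-beta.
INF = math.inf
tree = (INF, -INF, [(INF, -INF, [3, 5]), (INF, -INF, [6, 9])])
print(search(tree, -INF, INF, True))     # prints 6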

	The now standard alpha-beta pruning was discovered by
Timothy Hart and Roland Silver in the course of a discussion
of whether a Kalah program should use alpha-beta.  They proved
that it was better than minimax and also showed that it examined
only about the square root of the number of nodes examined by minimax,
provided the moves were optimally ordered.
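
	The square-root claim can be stated explicitly.  For a uniform tree
with n moves in each position, searched to depth d, the best-case count of
terminal positions examined (in the form later published by Knuth and Moore)
is

    n^{\lceil d/2 \rceil} + n^{\lfloor d/2 \rfloor} - 1 \approx 2 n^{d/2} = 2 \sqrt{n^d},

compared with n^d terminal positions for straight minimax.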

	Neither I nor they published a formal paper about alpha-beta.
I don't know why they didn't, but my reason was that alpha-beta
was for me only one heuristic to be included in a game-playing
program, and I was hoping for enough heuristics to get a really
good chess player before publishing.  I didn't know how few and
far between good results in AI were.

	Don Knuth, followed by others, subsequently published on the AA
properties of alpha-beta.  His primary approach was to assume a
random distribution of values at the leaves of the move tree and
then determine the expected improvement given by alpha-beta.  Here is
the fault of analysis of algorithms.

	Namely, this is all that can reasonably be done using a
general game-tree model, but the results are entirely irrelevant
for real game programs.  This is because the core of a game-playing
program using alpha-beta is the move ordering program.  If the
move ordering program works very well, and such programs usually
work quite well, the number of positions examined is close to
n↑(d/2), where n is the number of moves in a position and d is the
depth of search, and this is enormously better than the result
obtained by AA methods for random ordering.  However, the move
ordering heuristics
have no obvious mathematical properties on which AA can get
a grip.  This is why AA is doomed to irrelevance in such matters.
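
	As an illustration of the kind of rule involved, here is a sketch of
a typical static move orderer; the Move fields and the ranking below are
invented for the example, and it is exactly this sort of ad hoc ranking
that offers AA no clean model.

from dataclasses import dataclass

@dataclass
class Move:
    name: str
    is_capture: bool = False
    gives_check: bool = False
    static_gain: int = 0        # cheap evaluation of the resulting position

def order_moves(moves):
    # Captures first, then checks, then whatever the cheap evaluation likes.
    return sorted(moves, key=lambda m: (not m.is_capture,
                                        not m.gives_check,
                                        -m.static_gain))

moves = [Move("P-K4"), Move("QxP", is_capture=True),
         Move("B-N5ch", gives_check=True, static_gain=1)]
print([m.name for m in order_moves(moves)])   # ['QxP', 'B-N5ch', 'P-K4']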

	Worse, the tree model of a game vanishes if the killer
heuristic, one of the more useful move orderers, is used.  The killer
heuristic puts a move that was successful in one branch of the
tree on a list to be tried early in parallel branches of the
tree.  Thus if P-N6 is successful in one position, perhaps
because it threatens some combination, it is tried in parallel
positions.  However, the move tree model of a game makes no
provision for the identification of moves in different positions.
There is no reason to suppose that P-N6 will be the fifth
move generated in one position if it happens to be the fifth
move in another.
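
	A minimal sketch of the killer heuristic as an ordering layer follows.
Moves are identified only by name, e.g. the string "P-N6", which is exactly
the cross-position identification the bare tree model lacks; the function
names and the limit of two killers per depth are illustrative choices.

from collections import defaultdict

killers = defaultdict(list)     # search depth -> recently successful moves

def record_killer(move, depth, max_killers=2):
    # Remember a move that just produced a cutoff at this depth.
    ks = killers[depth]
    if move in ks:
        ks.remove(move)
    ks.insert(0, move)
    del ks[max_killers:]

def order_with_killers(moves, depth):
    # Try the remembered killers for this depth before the other moves.
    first = [m for m in killers[depth] if m in moves]
    return first + [m for m in moves if m not in first]

# Shape of the use inside an alpha-beta search:
#   for move in order_with_killers(generate_moves(pos), depth):
#       ...search the resulting position...
#       if it produced a cutoff: record_killer(move, depth); break
record_killer("P-N6", 3)
print(order_with_killers(["P-K4", "N-KB3", "P-N6"], 3))   # ['P-N6', 'P-K4', 'N-KB3']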